8 research outputs found

    Visual object detection from lifelogs using visual non-lifelog data

    Limited by the challenge of insufficient training data, research into lifelog analysis, especially visual lifelogging, has not progressed as fast as expected. To advance research on object detection in visual lifelogs, this thesis builds a deep learning model that enhances visual lifelogs by utilizing other sources of visual (non-lifelog) data, which are more readily available. Through theoretical analysis and empirical validation, the first step of the thesis identifies the close connection between lifelog images and non-lifelog images. Following that, the second phase employs a domain-adversarial convolutional neural network to transfer knowledge from the domain of visual non-lifelog data to the domain of visual lifelogs. The third section of this work then considers the task of visual object detection in lifelogs, which could be easily extended to other related lifelog tasks. One intended outcome of the study, on a theoretical level of lifelog research, is to identify the relationship between visual non-lifelog data and visual lifelog data from the perspective of computer vision. From a practical point of view, a second intended outcome of the research is to demonstrate how to apply domain adaptation to enhance learning on visual lifelogs by transferring knowledge from visual non-lifelogs. Specifically, the thesis utilizes variants of convolutional neural networks. Furthermore, a third intended outcome is the release of a visual non-lifelog dataset that corresponds to an existing visual lifelog one. Finally, another output from this research is the suggestion that visual object detection from lifelogs could be seamlessly used in other tasks on visual lifelogging.
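
The abstract does not say how its domain-adversarial network is built. A common way to realise domain-adversarial training is a gradient reversal layer placed between the feature extractor and a domain classifier. The sketch below shows only that layer, under the assumption that the thesis follows this general pattern; the class name `GradientReversal` and the weight `lam` are hypothetical names chosen for illustration.

```python
import numpy as np


class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam: float = 1.0):
        # lam is an assumed trade-off weight, not a value from the thesis.
        self.lam = lam

    def forward(self, features: np.ndarray) -> np.ndarray:
        # Features pass through unchanged on their way to the domain classifier.
        return features

    def backward(self, grad_from_domain_classifier: np.ndarray) -> np.ndarray:
        # The feature extractor receives the negated gradient, which pushes it
        # to make source (non-lifelog) and target (lifelog) features harder
        # for the domain classifier to tell apart.
        return -self.lam * grad_from_domain_classifier


grl = GradientReversal(lam=0.5)
feats = np.array([[0.2, -1.0], [1.5, 0.3]])
out = grl.forward(feats)
grad = grl.backward(np.ones_like(feats))
```

In a full model this layer would sit inside an automatic-differentiation framework; the manual `forward`/`backward` pair here only makes the sign flip explicit.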

    Transfer nonnegative matrix factorization for image representation

    Nonnegative Matrix Factorization (NMF) has received considerable attention due to its psychological and physiological interpretation of naturally occurring data whose representation may be parts-based in the human brain. However, when labeled and unlabeled images are sampled from different distributions, they may be quantized into different basis vector spaces and represented in different coding vector spaces, which may lead to low representation fidelity. In this paper, we investigate how to extend NMF to the cross-domain scenario. We accomplish this goal through TNMF, a novel semi-supervised transfer learning approach. Specifically, we aim to minimize the distribution divergence between labeled and unlabeled images, and incorporate this criterion into the objective function of NMF to construct new robust representations. Experiments show that TNMF outperforms state-of-the-art methods on real dataset
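
As a rough illustration of the two ingredients named in this abstract, the sketch below runs plain NMF with the standard multiplicative updates and then measures a simple gap between the mean coding vectors of labeled and unlabeled samples. This is not TNMF: the abstract does not state which divergence is used or how it enters the updates, so `mean_code_gap` is an assumed stand-in that is only measured, not minimized, and the data are random.

```python
import numpy as np


def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor nonnegative X (features x samples) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, rank)) + eps   # basis vectors (columns)
    H = rng.random((rank, n_samples)) + eps    # coding vectors (columns)
    for _ in range(n_iter):
        # Multiplicative updates for the squared Frobenius reconstruction error.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def mean_code_gap(H, is_labeled):
    """Squared distance between mean codes of labeled and unlabeled samples.

    Hypothetical divergence measure used only for illustration.
    """
    gap = H[:, is_labeled].mean(axis=1) - H[:, ~is_labeled].mean(axis=1)
    return float(gap @ gap)


rng = np.random.default_rng(1)
X = rng.random((20, 30))                 # synthetic nonnegative data
W, H = nmf(X, rank=5)
is_labeled = np.arange(30) < 15          # first half treated as labeled
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
gap = mean_code_gap(H, is_labeled)
```

A transfer variant would add a weighted divergence term to the reconstruction objective so that both terms shape `W` and `H` during the updates.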

    Negative faceblurring: a privacy-by-design approach to visual lifelogging with Google Glass

    Wearable devices such as Google Glass are receiving increasing attention and look set to become part of our technical landscape over the next few years. At the same time, lifelogging is a topic that is growing in popularity, with a host of new devices on the market that visually capture life experience in an automated manner. We describe a visual lifelogging solution for Google Glass that is designed to capture life experience in rich visual detail, yet maintain the privacy of unknown bystanders.

    Real-time behavioural analysis using Google Glass

    Lifelogging is a form of pervasive computing in which people digitally record their own daily lives in varying amounts of detail, for a variety of purposes. Lifelogging offers huge potential for supporting behaviour change because it can capture the totality of life experience and provide heretofore unknown levels of insight into the real-world activities of the lifelogger. In this paper we present a real-time curated lifelogging prototype that supports real-time behavioural analysis through immediate feedback and intervention to the lifelogger.

    Bridged Transformer for Vision and Point Cloud 3D Object Detection

    3D object detection is a crucial research topic in computer vision, which usually uses 3D point clouds as input in conventional setups. Recently, there has been a trend of leveraging multiple sources of input data, such as complementing the 3D point cloud with 2D images, which often have richer color and less noise. However, the heterogeneous geometry of the 2D and 3D representations prevents us from applying off-the-shelf neural networks to achieve multimodal fusion. To that end, we propose Bridged Transformer (BrT), an end-to-end architecture for 3D object detection. BrT is simple and effective, and learns to identify 3D and 2D object bounding boxes from both points and image patches. A key element of BrT lies in the utilization of object queries for bridging 3D and 2D spaces, which unifies different sources of data representations in Transformer. We adopt a form of feature aggregation realized by point-to-patch projections, which further strengthens the correlations between images and points. Moreover, BrT works seamlessly for fusing the point cloud with multi-view images. We experimentally show that BrT surpasses state-of-the-art methods on the SUN RGB-D and ScanNetV2 datasets. Comment: CVPR 202
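
To make the idea of a point-to-patch projection concrete, the sketch below maps 3D points to the image patch they fall into. It assumes a pinhole camera with a known intrinsic matrix `K`, points already expressed in camera coordinates, and an image whose width is divisible by the patch size. None of these details come from the abstract; the function name `point_to_patch` and all numeric values are hypothetical.

```python
import numpy as np


def point_to_patch(points_cam, K, image_hw, patch_size):
    """Map 3D points (camera coordinates) to flat indices of image patches.

    Returns -1 for points behind the camera or projecting outside the image.
    Assumes the image width is divisible by patch_size.
    """
    height, width = image_hw
    z = points_cam[:, 2]
    in_front = z > 0
    safe_z = np.where(in_front, z, 1.0)   # avoid dividing by zero or negative depth
    uvw = points_cam @ K.T                # homogeneous pixel coordinates
    u = uvw[:, 0] / safe_z
    v = uvw[:, 1] / safe_z
    inside = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    patches_per_row = width // patch_size
    col = np.floor(u / patch_size).astype(int)
    row = np.floor(v / patch_size).astype(int)
    return np.where(inside, row * patches_per_row + col, -1)


# Assumed intrinsics: focal length 100 px, principal point at the image centre.
K = np.array([[100.0, 0.0, 64.0],
              [0.0, 100.0, 64.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 1.0],     # projects to the image centre
                   [-0.5, 0.2, 1.0],    # projects left of and below the centre
                   [0.0, 0.0, -1.0],    # behind the camera
                   [5.0, 0.0, 1.0]])    # projects outside the image
patch_ids = point_to_patch(points, K, image_hw=(128, 128), patch_size=16)
```

With such indices, features of each point could be paired with features of the patch it projects into, which is one plausible way to aggregate information across the two modalities.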

    Visual object detection from lifelogs using visual non-lifelog data

    Limited by the challenge of insufficient training data, research into lifelog analysis, especially visual lifelogging, has not progressed as fast as expected. To advance research on object detection in visual lifelogs, this thesis builds a deep learning model that enhances visual lifelogs by utilizing other sources of visual (non-lifelog) data, which are more readily available. Through theoretical analysis and empirical validation, the first step of the thesis identifies the close connection between lifelog images and non-lifelog images. Following that, the second phase employs a domain-adversarial convolutional neural network to transfer knowledge from the domain of visual non-lifelog data to the domain of visual lifelogs. The third section of this work then considers the task of visual object detection in lifelogs, which could be easily extended to other related lifelog tasks. One intended outcome of the study, on a theoretical level of lifelog research, is to identify the relationship between visual non-lifelog data and visual lifelog data from the perspective of computer vision. From a practical point of view, a second intended outcome of the research is to demonstrate how to apply domain adaptation to enhance learning on visual lifelogs by transferring knowledge from visual non-lifelogs. Specifically, the thesis utilizes variants of convolutional neural networks. Furthermore, a third intended outcome is the release of a visual non-lifelog dataset that corresponds to an existing visual lifelog one. Finally, another output from this research is the suggestion that visual object detection from lifelogs could be seamlessly used in other tasks on visual lifelogging.

    Demographic attributes prediction using extreme learning machine

    Demographic attributes prediction is fundamental to many real-world applications, such as recommendation, personalized search and behavior targeting. Although a variety of subjects are involved in demographic attributes prediction (for example, there are requirements to recognize and predict demography from psychology), the traditional approach is dynamic modeling on a specified field and distinctive datasets. However, dynamic modeling takes researchers a lot of time and energy, and even once it is done, no one knows how good or bad the result is. To tackle these problems, this chapter proposes a framework with classifiers at its core, consisting of three main components: data processing, prediction using classifiers, and prediction adjustments. The data processing component cleans and formats the data. Its first step extracts relatively independent data from the complicated original dataset. In the next step, the extracted data go through different paths based on their types. In the last step, all the data are transformed into a demographic attributes matrix. For prediction, the demographic attributes matrix is taken as the input of the classifiers, and the testing dataset comes from the same matrix as well. The classifiers in the experiments include conventional state-of-the-art ones and Extreme Learning Machine (ELM), a new outstanding classifier. From the results of experiments on two unique datasets, it is concluded that ELM outperforms the others. In the prediction adjustment stage, two kinds of adjustment strategies are proposed, for single target attributes and multiple target attributes respectively. The single-target strategies include adjusting the parameters of the classifiers, adjusting the number of classes of the target attributes, and adjusting the public attributes. The multiple-target strategy uses the outputs of the first prediction as the inputs of a second prediction to improve on the accuracy of the first prediction. The proposed framework consumes less time than traditional dynamic modeling methods, and because of its regular patterns, researchers using it do not need to fully study the knowledge of the various subjects involved. In addition, the adjustment strategies place no restrictions on the datasets and are therefore universally useful. In some cases dynamic modeling has the advantage of precision, resulting in better accuracy, but the results from the proposed framework can then serve as a comparison. In this work, a universal demographic attributes prediction framework is proposed to work on a variety of datasets with Extreme Learning Machine (ELM). The framework consists of three main components: first, processing raw data and extracting attribute features depending on the data types; second, predicting desired attributes by classification; third, improving the accuracy of the classifiers through various adjustment strategies. Two experiments with different data types on real-world prediction problems demonstrate that the framework achieves better accuracy than other traditional state-of-the-art prediction methods.
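
For readers unfamiliar with Extreme Learning Machine, the sketch below shows the basic classifier in its usual form: a hidden layer with random, fixed weights followed by output weights solved in closed form by least squares. It does not reproduce the chapter's data processing or adjustment strategies. The class name `ELMClassifier`, the hidden-layer size and the synthetic two-class data are assumptions made for illustration.

```python
import numpy as np


class ELMClassifier:
    """Single-hidden-layer network with random hidden weights (basic ELM)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random, untrained hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        targets = np.eye(n_classes)[y]           # one-hot class targets
        # Output weights: least-squares solution via the pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


def make_blobs(rng, n_per_class):
    """Synthetic stand-in for a demographic attributes matrix (two classes)."""
    X0 = rng.normal(loc=-2.0, scale=0.5, size=(n_per_class, 2))
    X1 = rng.normal(loc=2.0, scale=0.5, size=(n_per_class, 2))
    return np.vstack([X0, X1]), np.repeat([0, 1], n_per_class)


rng = np.random.default_rng(42)
X_train, y_train = make_blobs(rng, 100)
X_test, y_test = make_blobs(rng, 50)
model = ELMClassifier(n_hidden=50).fit(X_train, y_train)
accuracy = float(np.mean(model.predict(X_test) == y_test))
```

Because only the output weights are fitted, training reduces to one pseudo-inverse, which is the usual reason ELM is described as fast to train. The multiple-target adjustment described above could be approximated by appending the first model's predictions as extra input columns for a second model.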

    MemLog, an enhanced Lifelog annotation and search tool

    Recently, we have observed a convergence of technologies that has led to the emergence of lifelogging as a potentially pervasive technology with many real-world use cases. While it is becoming easier to gather massive lifelog data archives with wearable cameras and sensors, there are still challenges in developing effective retrieval systems. One such challenge is gathering annotations to support user access or machine learning tasks in an effective and efficient manner. In this work, we demonstrate a web-based annotation system for sensory and visual lifelog data and show it in operation on a large archive of nearly 1 million lifelog images and 27 semantic concepts in 4 categories.